HashTables

So I just finished the 3 part series on how hash maps work. From this I thought I'd write a few notes, both in the hopse of silidifying my understanding a bit, and also as a way for me to be able to reference this in the future.

So I learned about two types of hash tables.. Open addressing hash tables, and Chaining hash tables. So the former is a way to implement a hash table with an array, and the latter uses linked lists.

Open addressing hash tables:

Open adressing hash tables are a way to use arrays as your system for storing data. You use a hash function that hashes you key to some value. This method is deterministic, thus it works the same exact way every single time. This function will return you an index in the array to store this value. Now... there is a probibility that there will be another value already in that perticular slot. If this is the case you will step down to the next array index and check if it's filled... if it's not you'll put that value there. If it is, you'll continue this process until you find an empyt slot. Because this is a repeatable process you'll be doing the same thing with you search method to find a hash value, and with your delete function that removes a key/value. Now the one caviot to this is that when you delete a key.. you need to give it a new value that's not the same "none" that the search is looking for, because it might get a hash pointing to that location for a value that does not exits there, but it exists in another part of the table, but you messed up the walking process down the table because of this delet, so you set a different flag on said deleted value that tells the search hey... you need to skip over me.. I'm not the value you are looking for. This does not break the search, because if the key that it's actually looking for is the one you deleted it will hit a point in the table where there is an actual none, or the table is ended. Now this new flag you put in for a deleted key has that effect on search, but not on insert. Insert know that it can save something there.

There is one important thing to note about open table adressing.. with the above implementation you can get into a situation where there are lots of clusters of data in this hash table, so you end up do long walks to find open keys or searching, thus it ends up being pretty ineffecent. So there's a way around that. It's called Double hashing I need to look into it more to understand it better, but it apparently solves this to some degree. The above method I described is called Linear Probing, because you are stepping linearly to find open slots in the table.

Linked List hash tables

These are very similar to the above system I described, but when you get colissions in the hash table you start adding nodes to a linked list insted of finding open slots in the array.


In [ ]: